ABSTRACT

We investigate the clustering properties of 13 QSO lines of sight in flat space, with 
average redshifts from $z \approx 2$ to 4. We estimate the 1-D power spectrum and the
integral density of neighbours, and discuss their variation with respect to redshift and 
column density. We compare the results with standard CDM models, and estimate 
the power spectrum of Lyman-α clustering as a function both of redshift and column
density. We find that a) there is no significant periodicity or characteristic scale; b) the 
clustering depends both on column density and redshift; c) the clustering increases 
linearly only if at the same time the HI column density decreases strongly with redshift. 
The results remain qualitatively the same assuming an open cosmological model. 



1 INTRODUCTION 



As new and deeper galaxy redshift surveys are being com- 
pleted, a definite picture of the local galaxy distribution is 
slowly emerging. This distribution is characterized by large 
voids surrounded by sheets of clustered matter (see e.g. El- 
Ad et al. 1996a, 1997), and can be well quantified by statis- 
tical descriptors like the power spectrum (see e.g. Park et al. 
1994, Tadros & Efstathiou 1995, Tadros & Efstathiou 1996, 
Lin et al. 1996). In a few years, surveys like the Sloan Dig- 
ital Sky Survey (Loveday 1996) and the Two Degree Field 
Redshift Survey (Colless 1998, Maddox 1998) will extend 
our knowledge of the clustering of the luminous matter by 
at least a factor of ten both in depth and coverage. 

However, mapping the luminous galaxies is clearly not 
enough to fully understand the history and geography of our 
Universe. First, even the deepest surveys so far planned do 
not reach beyond $z \approx 1$. Secondly, we already know that
a prominent component of the matter, perhaps a dominant 
one, is not bright enough to be included in the present surveys. Ranging
from truly dark matter, to dwarf galaxies, to very low surface brightness
galaxies, to Lyman-α clouds, the systems which can easily escape detection
are many and varied. Their importance in solving crucial questions, such as
whether the voids traced by bright galaxies are really empty of matter
(Szomoru et al. 1994, Stocke et al. 1995), what the paths of galaxy
formation are, and what role galaxy interactions play, cannot be
overestimated.

In this respect, the large number of absorption lines seen in quasar
spectra, the so-called Lyα forest, is an important tracer of the
intervening matter distribution along the QSO lines of sight at any redshift
in the observable Universe ($0 \lesssim z \lesssim 5$). Even if the Lyα
forest is a one-dimensional distribution, the fact that a line of sight can
contain at high redshift ($z \gtrsim 2.5$) up to a few hundred absorption
lines makes the statistics significant. While galaxies are associated with
emitting objects, the Lyα clouds represent the gas detected through the
absorption of HI and are not necessarily associated with stars. In this
sense they are matter tracers different from any other.

What fraction of the baryonic matter lies in the IGM is still a matter of
controversy. Hydrodynamical simulations of the Lyα forest (Rauch & Haehnelt
1995, Miralda-Escude et al. 1996, Rauch et al. 1997, Weinberg et al. 1997,
Bi & Davidsen 1997) indicate that this fraction is large for redshift
$z \gtrsim 2$ ($\Omega_{b,{\rm IGM}} h^2 \simeq 0.01$ to $0.02$) and that
collapsed objects in the young Universe probably represent a small
correction. However, observations of the Lyα forest with column density in
the range $12.8 < \log N_{\rm HI} < 16.0$ by Kim et al. (1997) using
Keck/HIRES QSO spectra give $\Omega_{b,{\rm IGM}} h^2 < 0.01$ in
$2.1 < z < 3.5$. The discrepancy is probably due to the fact that in the
simulations the (photo and collisional) ionisation of the IGM is assumed to
be higher than that assumed by Kim et al., who neglected collisional
ionisation because they derived an upper limit to the temperature of the gas,
$T < 10^5$ K, from the measured Doppler parameter of the Lyα lines.

The study of signatures in the distribution of Lyα lines has been performed
mainly using the traditional tool for the galaxy distribution: the two point
correlation function (TPCF). While it has been clear for many years (Sargent
et al. 1980) that low resolution spectroscopy was unable to give a
definitive answer on the clustering of the Lyα forest for scales of
$\Delta v < 300$ km s$^{-1}$ (no signal was found for larger scales), the
advent of high resolution spectroscopy in the last few years has started a
more controversial discussion on the presence of clustering at small scales,
and on its evolution with redshift. A weak signal has been found on small
scales ($50 < \Delta v < 300$ km s$^{-1}$; Webb 1987, Rauch et al. 1992,
Chernomordik 1995, Cristiani et al. 1995). More recently, a detection of
redshift evolution of the TPCF, which is stronger at lower redshift, in a
large sample of high column density Lyα lines ($\log N_{\rm HI} > 13.8$)
(Cristiani et al. 1997, Savaglio et al. 1999) has been only marginally
confirmed by a similar study by Kim et al. (1997), keeping open the question
whether the Lyα forest does cluster and at which scales. This controversy
makes it necessary to find other tools of investigation for the same
problem, since the TPCF is neither the only statistical technique nor
necessarily the most appropriate. Pando & Fang (1996) have shown that the
scaling properties of the Lyα forest can be studied using the wavelet
decomposition analysis. The results are that clustering is present up to
large scales (20 $h^{-1}$ Mpc) and that the 1-D power spectrum is
significantly different from that of a Poissonian distribution. The
box-counting technique (Carbone & Savaglio 1996, Savaglio & Carbone 1997)
has the advantage of avoiding boundary effects, and has given a positive
signal evolving with redshift (up to about 3 $h^{-1}$ Mpc at $z \sim 3.8$
and up to about 20 $h^{-1}$ Mpc at $z \sim 2$). The Lyα forest has been
studied also by Hui et al. (1997), by deriving the expected column density
distribution given a theoretical power spectrum.

We will investigate the distribution of the Lyman-α clouds by two different,
but related, descriptors: the 1-dim power spectrum, and the density of
neighbours, or conditional density. The former is particularly suited to
detecting preferential scales in the distributions, since it decomposes the
system in plane waves. Here "preferential scale" means simply a wavelength
whose Fourier coefficient is significantly different from that expected
assuming a Gaussian distribution of the Fourier coefficients. A preferential
scale of clustering has sometimes been claimed in the galaxy distribution
(Broadhurst et al. 1991; Einasto et al. 1997), so that it is interesting to
see if something similar holds for the Lyman-α clouds. The 1-dim power
spectrum is however intrinsically a very noisy quantity, since the Fourier
coefficients are statistically independent quantities in the limit of a
Gaussian field. The main advantage is that, as a consequence of the
statistical independence, their statistics is straightforward. On the other
hand, the conditional density is an integral quantity, because it counts the
average density of clouds within a distance $R$ of another cloud; its signal
is stable, but its statistics can only be approximated via Monte Carlo
simulations. The conditional density has a reduced scatter with respect to
the TPCF, and is therefore easier to compare to theoretical models, although
the information on specific scales is spread over a large interval.

Once the analysis of the dataset is completed, we can compare it to
theoretical models of the density fluctuation field. The primary goal is to
derive the Lyman-α power spectrum in the linear regime as a function both of
redshift and column density. We will therefore derive a biasing function
$b_{\rm Ly\alpha}(z, N_{\rm HI})$, defined through the ratio of the cloud
spectrum to a theoretical CDM spectrum at scales large enough to be in the
linear regime of gravitational clustering. Since we will consider clouds at
$z > 2$, we can safely assume that comoving scales larger than 3 Mpc/$h$ are
linear, i.e. the variance in spheres of 3 Mpc/$h$ at $z > 2$ is less than
unity. Although we will report data even at smaller scales, all the relevant
numerical results are obtained using only scales larger than 3 Mpc/$h$. For
simplicity, we normalize the CDM spectrum to the value that matches the
present day cluster abundance, i.e.
$\sigma_8(z = 0) \approx 0.6\,\Omega_0^{-0.6}$ (White, Efstathiou & Frenk
1993); a different choice of $\sigma_8$, for instance normalizing to the
microwave background fluctuations or to the present galaxy clustering,
simply rescales $b^2_{\rm Ly\alpha}(z, N_{\rm HI})$ by a constant factor.
Since at the high redshifts of our data the space-time geometry has an
important effect, we test the dependence of our results by considering in
the following two values of the cosmological density: the inflationary value
$\Omega_0 = 1$ and the observationally preferred $\Omega_0 = 0.4$. The final
product is the cloud power spectrum approximated as

$$P_{\rm Ly\alpha}(k,z) = b^2_{\rm Ly\alpha}(z, N_{\rm HI})\, P_{\rm cdm}(k,z) \qquad (1)$$

where $P_{\rm cdm}(k,z)$ is a CDM spectrum normalized to
$\sigma_8 = 0.6\,\Omega_0^{-0.6}$, evolving with $z$ as
$D_+(\Omega_0,z)^2$, where $D_+(\Omega_0,z)$ is the fluctuation growth
function in the linear regime. We adopt the approximation (Padmanabhan 1993)

$$D_+(\Omega_0,z) = \frac{1 + \frac{3}{2}\Omega_0}{1 + \frac{3}{2}\Omega_0 + \frac{5}{2}\Omega_0 z} \qquad (2)$$

which is valid for $\Omega_0 > 0.1$ and $\Lambda = 0$ (the effect of a
finite $\Lambda$ is actually negligible on $D_+$). For $\Omega_0 = 1$ this
gives $(1+z)^{-1}$ as expected. The biasing function $b_{\rm Ly\alpha}$
therefore expresses the evolution of the clustering of the clouds both
versus redshift and column density. One can disentangle the evolutionary
path in $z$ and $N_{\rm HI}$ only by means of further assumptions. Although
this task is left to future work, we obtain the preliminary result that the
Lyα clustering evolves linearly only if at the same time the average column
density decreases strongly with time.
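
The growth-function approximation quoted above is simple enough to check numerically. The following is a minimal Python sketch, not part of the original analysis: the function name `growth_factor` is hypothetical, and the closed form is assumed to be $D_+ = (1 + 1.5\,\Omega_0)/(1 + 1.5\,\Omega_0 + 2.5\,\Omega_0 z)$, which reduces to $(1+z)^{-1}$ for $\Omega_0 = 1$.

```python
def growth_factor(omega0: float, z: float) -> float:
    """Approximate linear growth factor D_+(Omega_0, z), equal to 1 at z = 0.

    Assumed form (taken to hold for Omega_0 > 0.1 and Lambda = 0):
        D_+ = (1 + 1.5*Omega_0) / (1 + 1.5*Omega_0 + 2.5*Omega_0*z)
    """
    if omega0 <= 0.1:
        raise ValueError("approximation assumed valid only for Omega_0 > 0.1")
    a = 1.0 + 1.5 * omega0
    return a / (a + 2.5 * omega0 * z)


if __name__ == "__main__":
    # The two cosmological densities considered in the text.
    for omega0 in (1.0, 0.4):
        for z in (2.0, 3.0, 4.0):
            d = growth_factor(omega0, z)
            # The linear power spectrum scales as D_+ squared.
            print(f"Omega_0={omega0:.1f}  z={z:.0f}  D+={d:.4f}  D+^2={d * d:.4f}")
```

In the open case the fluctuations grow more slowly, so at fixed present-day normalization the sketch returns a larger $D_+$ at high redshift than in the flat case.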

Our method is based on the parametric estimation of 
the power spectrum of fitted absorption lines. A completely 
different method has been proposed in a recent work by 
Croft et al. (1998). They can successfully recover both the 
shape and the amplitude of the linear power spectrum of 
initial mass fluctuations (assumed to be Gaussian) from 
the one-dimensional power spectrum derived from QSO 
Lyman-a spectra, artificially created by realistic hydrody- 
namic cosmological simulations. It is premature to compare 
the results of the two approaches, also because Croft et al. 
apply the method to only one observed QSO line of sight; 
the results on the spectrum of Q1244+231 at $z \approx 3$ are
consistent with a standard CDM model with normalization
$\sigma_8 = 0.5$. This conclusion applies also to our findings, although
we find a very marked dependence on redshift and on column density.



2 THE SAMPLE 

The sample contains 13 absorption line lists of QSO lines of sight collected
from the literature, for a total of 2450 Lyα absorption lines in the
redshift range $1.68 < z < 4.04$ (in the following, "lines" always means Lyα
absorption lines). These line lists have been obtained by means of high
resolution spectroscopy ($6.6 < {\rm FWHM} < 18$ km s$^{-1}$) and have no
observational gaps in the Lyα forest. The region of the proximity effect
(closer than about 8 Mpc from the source) has been excluded for
completeness. We also excluded from the sample those systems which are known
to be metal systems. In Table 1 we give details and references of the sample
used. For one object (QSO 1 and 1b in Table 1), we compared two line lists
given by two different groups (Lu et al. 1997, Savaglio et al. 1997) in
order to understand whether any significant difference between the two
distributions is connected to different absorption line analysis approaches;
in the analysis we always use the sample 1.

Table 1. The QSO absorption line sample
1 Lu et al. (1996), 2 Savaglio et al. (1997), 3 Cristiani et al. (1997), 4 Cristiani et al. (1995), 5 Hu et al. (1995), 6 D'Odorico et al.
(1997), 7 Rodriguez-Pascual et al. (1995), 8 Khare et al. (1996), 9 Carswell et al. (1991), 10 Kulkarni et al. (1996)

To investigate the dependence of the clustering on the column density and on
the redshift, we group the 13 lines of sight in various subsets. First, we
consider separately the six lines of sight with higher average redshift
(average redshift $\langle z \rangle \simeq 3.28$; numbers 1 to 6 in
Table 1), and the seven systems with lower average redshifts
($\langle z \rangle \simeq 2.39$; numbers 7 to 13 in Table 1). We will refer
to these two sets as the higher and the lower redshift sets, respectively.
Then, we group the 13 QSO lists in six sets of four lines of sight each (the
sets are partially overlapping), so as to obtain average redshifts ranging
from 2.05 to 3.47, as detailed in Table 2. For instance, the
number-of-neighbours statistics for the set a of Table 2 is obtained by
averaging over the QSO lines of sight 0000-26, 1208-26, 0055-26 and 2355+01.
Finally, we consider for each of the sets the Lyman-α lines with column
density above a certain threshold, $\log N_{\rm HI} > 13.3$, 13.5, 13.8, 14.
In all, we obtain 30 different, although partially overlapping, subsets of
clouds with various values of cut-off in the column density, and various
average redshifts. This grid of sets will be exploited to get the full,
bidimensional variation of clustering with column density and redshift.



3 THE METHOD 

Table 2. Data subsets

Let $z_i$, $i = 1, \ldots, n$ be the redshift of the $i$-th Lyman-α cloud
along a line of sight, and let $\eta_i$ be its mass. We will later assume a
unitary mass for the clouds. The Fourier transform of the cloud distribution
is then

$$\delta_k = \frac{1}{L} \int \left[ \frac{\rho(r)}{\rho_0} - 1 \right] \exp(ikr)\, dr \qquad (3)$$

where

$$r = r(z) \qquad (4)$$

is the cloud distance at redshift $z$, $L$ is the total length of the cloud
system, $\rho(r)$ is the mass density, $\rho_0$ its average. We can put
$\rho(r) = \sum_i \eta_i \delta_D(r - r_i)$, where $\delta_D$ is a Dirac
delta function. It follows that the total mass along the line of sight is
$M = \sum_i \eta_i$, and that
$\rho_0 = M/L = \sum_i \eta_i / L = \eta_0 n / L$, where
$\eta_0 = \sum_j \eta_j / n$ is the average mass. Then

$$\delta_k = n^{-1} \sum_i (\eta_i/\eta_0) \exp(ikr_i) - W_k \qquad (5)$$

where

$$W_k = \frac{1}{L} \int \exp(ikr)\, dr \qquad (6)$$

is the window transform. The window integral is approximated in $N_b$
equally spaced bins, which should be taken as small as possible, down to the
resolution limit of the data. Finally, the power spectrum is defined as

$$P(k) = L\, |\delta_k|^2 \qquad (7)$$
Figure 1. The average power spectra (in units of the noise) of the Lyman-α
lines with $\log N_{\rm HI} > 13.8$ (right panels) and
$\log N_{\rm HI} > 13.3$ (left panels) for the high and low redshift samples.
The dotted lines are the one and two sigma Poisson errors. The abscissa is
the wavelength $\lambda = 2\pi/k$.

By this definition, we have $P(0) = 0$. If the clouds are uncorrelated, then
on average $\langle \eta_i \eta_j e^{ik(r_i - r_j)} \rangle = 0$ for
$i \neq j$; therefore, we get



(8) 



and then we obtain the pure shot-noise power spectrum $P_{sn}$,
in the limit of an infinite sample (i.e. for $W_k = \delta_D(k)$)

$$P_{sn} = L \sum_j \eta_j^2 \Big/ \Big( \sum_j \eta_j \Big)^2 \qquad (9)$$



We derived the equations including the mass, but in the following we put
$\eta_j = 1$ (from which $P_{sn} = L/n$) because the relation between the
observed HI column density and the mass is dominated by poorly known
parameters like the cloud size and shape. As a simple test, we arbitrarily
assumed $\eta_j \sim \log N_{\rm HI}$ and found no qualitative difference. We
will evaluate $P(k)$ at $k_j = 2\pi j/L$, $j = 1, \ldots, N_b/2$, so as to
obtain $N_b/2$ coefficients $P(k_j)$.
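
The estimator described above can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions, not the code used for the analysis: the clouds have unit mass, the line of sight is a gap-free segment $[0, L]$ (so the window transform is computed analytically instead of in $N_b$ bins), and the function name `power_spectrum` is hypothetical.

```python
import cmath
import math


def power_spectrum(positions, length, n_modes):
    """Return [(k_j, P(k_j))] for unit-mass clouds on the segment [0, length].

    delta_k = (1/n) * sum_i exp(i k r_i) - W_k,   P(k) = L * |delta_k|^2,
    evaluated at the harmonics k_j = 2*pi*j/L, j = 1..n_modes.
    """
    n = len(positions)
    spectrum = []
    for j in range(1, n_modes + 1):
        k = 2.0 * math.pi * j / length
        # Window transform of a gap-free segment: (1/L) * int_0^L exp(ikr) dr.
        # It vanishes (up to rounding) at the harmonics; it is kept explicit
        # so that a binned window with gaps could be substituted here.
        w_k = (cmath.exp(1j * k * length) - 1.0) / (1j * k * length)
        delta_k = sum(cmath.exp(1j * k * r) for r in positions) / n - w_k
        spectrum.append((k, length * abs(delta_k) ** 2))
    return spectrum


if __name__ == "__main__":
    # Four equally spaced clouds: all the power sits in the fourth harmonic.
    for k, p in power_spectrum([0.0, 25.0, 50.0, 75.0], 100.0, 4):
        print(f"k = {k:.4f}   P(k) = {p:.3f}")
```

A perfectly periodic set of clouds is the extreme case of a "preferential scale": the harmonic matching the spacing carries $P = L$, far above the shot-noise level $L/n$, while the other harmonics are empty.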

The Fourier coefficients $\delta_k$, being linear combinations of many
independent random variables $\eta_i \exp(ikr_i)$, tend to be distributed as
Gaussian variables. The power spectrum coefficients, each the sum of the
squares of two Gaussian variables, follow a $\chi^2$ distribution with two
degrees of freedom, i.e. an exponential

$$D(P_j) = (1/P_{sn}) \exp(-P_j/P_{sn}) \qquad (10)$$

If the central limit theorem conditions are not fulfilled, e.g. the
$k$-modes are correlated or the data are strongly non-Gaussian, the
distribution of $P_j$ deviates from the exponential. To estimate the noise
level $P_{sn}$ for a given line of sight, we generate one thousand simulated
distributions of lines, placing the lines at random. Then, we estimate
$P_{sn}$ by fitting the distribution of power spectrum coefficients with the
exponential (10), and average over the realizations. The agreement with the
expected value (9) is excellent. From Eq. (10) we see that the variance of
the coefficients around $P_{sn}$ is

$$\sigma^2 = \int_0^\infty (P - P_{sn})^2\, \frac{1}{P_{sn}} \exp(-P/P_{sn})\, dP = P_{sn}^2 \int_0^\infty (x - 1)^2 \exp(-x)\, dx = P_{sn}^2 \qquad (11)$$
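
The Monte Carlo noise estimate can be illustrated with a short Python sketch. It is a simplified stand-in for the procedure described above, under explicit assumptions: unit masses, a gap-free segment, and the sample mean of the coefficients used in place of an exponential fit (for an exponential distribution the sample mean is the maximum-likelihood estimate of the scale). The names `harmonic_power` and `simulate_shot_noise` are hypothetical.

```python
import cmath
import math
import random


def harmonic_power(positions, length, j):
    """P(k_j) = L |delta_k|^2 for unit-mass clouds at the j-th harmonic.

    The window term is omitted because it vanishes at the harmonics of a
    gap-free segment.
    """
    k = 2.0 * math.pi * j / length
    delta_k = sum(cmath.exp(1j * k * r) for r in positions) / len(positions)
    return length * abs(delta_k) ** 2


def simulate_shot_noise(n_lines, length, n_modes, n_real, seed=0):
    """Mean and variance of P(k_j) over random (uncorrelated) realizations."""
    rng = random.Random(seed)
    samples = []
    for _ in range(n_real):
        positions = [rng.uniform(0.0, length) for _ in range(n_lines)]
        samples.extend(harmonic_power(positions, length, j)
                       for j in range(1, n_modes + 1))
    mean = sum(samples) / len(samples)
    var = sum((p - mean) ** 2 for p in samples) / len(samples)
    return mean, var


if __name__ == "__main__":
    n_lines, length = 50, 100.0
    mean, var = simulate_shot_noise(n_lines, length, n_modes=10, n_real=200)
    print("expected P_sn = L/n :", length / n_lines)
    print("simulated mean      :", round(mean, 3))
    print("simulated variance  :", round(var, 3), "(expected about P_sn^2)")
```

With enough realizations the simulated mean should approach $L/n$ and the variance should approach $P_{sn}^2$, which is the content of Eqs. (9) and (11).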



However, this gives only an idea of the spread of the spectrum coefficients
around the noise. To be more quantitative, we compare the cumulative
distribution function (CDF) of the spectrum coefficients
$P_j = L |\delta_{k_j}|^2$ with the CDF of the simulated distribution. The
noise CDF is

$$C(>P) = \int_P^\infty D(P')\, dP' = \exp(-P/P_{sn})$$

Therefore, a peak of height $P = cP_{sn}$ among $n_v$ peaks is
statistically significant if $n_v \exp(-c)$ is very low, e.g. $10^{-3}$.
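
The significance criterion above amounts to simple arithmetic, sketched here in Python. The function names are hypothetical, and the number of coefficients used in the example (100) is an arbitrary illustrative value, not a figure taken from the data.

```python
import math


def noise_cdf(p, p_sn):
    """C(>P) = exp(-P / P_sn): chance that one noise coefficient exceeds P."""
    return math.exp(-p / p_sn)


def expected_noise_peaks(c, n_peaks):
    """Expected number of pure-noise peaks above c * P_sn among n_peaks."""
    return n_peaks * math.exp(-c)


def threshold_for(n_peaks, max_expected=1e-3):
    """Height c (in units of P_sn) at which n_peaks * exp(-c) = max_expected."""
    return math.log(n_peaks / max_expected)


if __name__ == "__main__":
    n_peaks = 100  # hypothetical number of spectrum coefficients
    print("expected noise peaks above 9.5 P_sn:", expected_noise_peaks(9.5, n_peaks))
    print("height needed for the 1e-3 level   :", threshold_for(n_peaks))
```

Because the threshold grows only logarithmically with the number of coefficients, even an isolated peak many times above the noise need not be significant once all the examined modes are counted.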

The one-dimensional power spectrum is plagued by the 
aliasing problem stressed by Kaiser & Peacock (1991): spuri- 
ous high signal-to-noise peaks may be induced by the 1-dim 
geometry. However, since we compare the Lyα spectrum to
random realizations with the same geometry, and not to a 
theoretical significance level, our method is able to assess 
the probability that random realizations produce a given 
spectrum peak. 
The second quantity we use is the normalized density of neighbours, i.e.

$$\rho_n = \langle \rho(R) \rangle_n = \Big\langle \sum_i \eta_i / R \Big\rangle \Big/ \rho_P \qquad (12)$$

where the sum extends to all the lines within a distance $R$ of a given line
assumed as a center, and the average extends over all the centers, and where
$\rho_P = n/L$ is the expected Poisson density. At the scales at which a
clustering signal is detected, $\rho_n > 1$.

In evaluating $\rho_n$ the following problem arises. Let us order the
Lyman-α clouds along a line of sight on the $r$ axis, with increasing
distance $r$. Consider the Lyman-α clouds within a distance $R$ from the
$i$-th cloud at position $r_i$: there are clouds on the right, at
$r_i < r < r_i + R$, and clouds on the left, at $r_i - R < r < r_i$. Suppose
now the left set, for instance, is cut by the boundary of the sample. We
would count the clouds contained in it in $\rho(R)$ even if, in fact, the
left set contains lines only up to a distance smaller than $R$. If the
clustering decreases with scale, as expected, the left set cut by the
boundary would have on average a higher $\rho(R)$ than if it were not cut.
In other words, including sets cut by the boundary amounts to averaging
different scales in $\rho(R)$. This would result in a systematic boundary
effect, an effect which increases for large radii and for large clustering.
To avoid this, we include only the clouds in sets which are not cut by the
boundary: that is, around each cloud, we take only the right set if the left
one is cut, and vice versa.
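
The boundary treatment just described can be made concrete with a short Python sketch. It rests on assumptions that the text does not spell out: unit masses, a sample covering the segment $[0, L]$, and a plain average over all one-sided sets that are not cut (so a cloud far from both edges contributes both its left and its right set). The function name `neighbour_density` is hypothetical.

```python
from bisect import bisect_left, bisect_right


def neighbour_density(positions, length, radius):
    """Normalized one-sided density of neighbours rho_n(R) on [0, length].

    For every cloud, the right set (r_i, r_i + R] and the left set
    [r_i - R, r_i) are used only if they lie entirely inside the sample,
    so that sets cut by the boundary never enter the average.
    """
    r = sorted(positions)
    n = len(r)
    densities = []
    for i, ri in enumerate(r):
        if ri + radius <= length:            # right set is not cut
            count = bisect_right(r, ri + radius) - (i + 1)
            densities.append(count / radius)
        if ri - radius >= 0.0:               # left set is not cut
            count = i - bisect_left(r, ri - radius)
            densities.append(count / radius)
    if not densities:
        raise ValueError("radius too large: every set is cut by the boundary")
    poisson_density = n / length
    return (sum(densities) / len(densities)) / poisson_density


if __name__ == "__main__":
    regular = [0.5 + i for i in range(10)]        # evenly spaced clouds
    pairs = [1.0, 1.25, 5.0, 5.25, 8.0, 8.25]     # clouds clustered in pairs
    print("regular, R = 2  :", neighbour_density(regular, 10.0, 2.0))
    print("pairs,   R = 0.5:", neighbour_density(pairs, 10.0, 0.5))
```

The evenly spaced set returns the Poisson value of unity, while the paired set returns a value above unity at a radius comparable to the pair separation, as expected for a clustered distribution.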



Let us write down here the relations between the two quantities introduced,
the power spectrum and the density of neighbours, and the relation between
these and the 3-dim power spectrum $P_3(k)$. We have

$$P(k) = \frac{1}{2\pi} \int_k^\infty P_3(k')\, k'\, dk' \qquad (13)$$

$$\rho_n = 1 + \frac{1}{\pi} \int_0^\infty P(k)\, W(kR)\, dk \qquad (14)$$

where $W(x)$ is the 1-dim top-hat window function, $W(x) = \sin(x)/x$.
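
The two relations above can be evaluated numerically as in the following Python sketch. Several choices here are assumptions made for illustration only: the prefactors are taken as $1/2\pi$ in Eq. (13) and $1/\pi$ in Eq. (14), the infinite upper limits are replaced by a finite `k_max`, the integrals use a plain Simpson rule, and the Gaussian toy spectrum is chosen because its projection is known in closed form. All function names are hypothetical.

```python
import math


def simpson(f, a, b, n=2000):
    """Composite Simpson rule with an even number n of sub-intervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3.0


def project_to_1d(p3, k, k_max):
    """Eq. (13): P(k) = (1/2pi) int_k^k_max P_3(k') k' dk'."""
    return simpson(lambda q: p3(q) * q, k, k_max) / (2.0 * math.pi)


def window(x):
    """1-dim top-hat window W(x) = sin(x)/x, with W(0) = 1."""
    return 1.0 if abs(x) < 1e-8 else math.sin(x) / x


def density_of_neighbours(p1, radius, k_max):
    """Eq. (14): rho_n(R) = 1 + (1/pi) int_0^k_max P(k) W(kR) dk."""
    integral = simpson(lambda k: p1(k) * window(k * radius), 0.0, k_max)
    return 1.0 + integral / math.pi


if __name__ == "__main__":
    def toy_p3(k):
        # Toy 3-dim spectrum; its 1-dim projection is exp(-k^2) / (4 pi).
        return math.exp(-k * k)

    def toy_p1(k):
        return project_to_1d(toy_p3, k, 8.0)

    print("P(0.5) numeric :", toy_p1(0.5))
    print("P(0.5) analytic:", math.exp(-0.25) / (4.0 * math.pi))
    for radius in (0.5, 2.0, 5.0):
        print("R =", radius, " rho_n =", density_of_neighbours(toy_p1, radius, 8.0))
```

For a positive spectrum of this kind the excess $\rho_n - 1$ decreases with increasing $R$, which is the behaviour assumed in the discussion of the boundary effect.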

The theoretical spectra are in real space. As is well known, the peculiar
velocities suppress the power of the real space spectrum at small scales and
enhance it at large scales. The redshift space spectrum $P_{3s}(k)$ is then a
function of the real space spectrum $P_{3r}(k)$, of the factor
$\beta = \Omega_0^{0.6}$, and of the cloud velocity dispersion $\sigma_v$
along the line of sight. The overall effect can be modeled according to the
following formula derived by Peacock & Dodds (1994):

$$P_{3s}(k) = P_{3r}(k)\, G(\beta, y)$$

$$G(\beta, y) = \frac{\sqrt{\pi}}{8} \frac{{\rm erf}(y)}{y^5} \left[ 3\beta^2 + 4\beta y^2 + 4y^4 \right] - \frac{\exp(-y^2)}{4y^4} \left[ \beta^2 (3 + 2y^2) + 4\beta y^2 \right] \qquad (15)$$

where $y = k \sigma_v H_0^{-1}$. We will investigate scales from 1 to
100 Mpc/$h$. The lower limit of the scale is close to the typical transverse
size of the Lyα clouds, being of the order of a few hundred $h^{-1}$ kpc
with no strong redshift evolution (Crotts et al. 1997, Smette et al. 1995).
At scales below 1 or 2 Mpc/$h$ it is very difficult to model accurately the
Lyman-α power spectrum, due to various effects: the line-of-sight velocity
dispersion, the thermal broadening of the lines, the procedure of line
fitting, the non-linear clustering. Croft et al. (1998) found that a
correction factor $\exp(-0.5 k^2 r_s^2)$ with $r_s = 1.5$ Mpc/$h$ may account
empirically for all these small scale effects. Although we included some of
the effects mentioned above to correct the spectrum at small scales, we work
out the numerical results using only scales larger than 3 Mpc/$h$, where the
corrections are small. We take for the velocity dispersion
$\sigma_v = 100$ km s$^{-1}$: simulations give in fact peculiar velocities
of about 100 km s$^{-1}$ or less (Miralda-Escude 1996, Rauch et al. 1997,
Bi & Davidsen 1997), while observations of quasar pairs give velocity
differences between common absorption lines of less than about
100 km s$^{-1}$ (Smette et al. 1995, Dinshaw et al. 1994, $z \sim 2$).
Although the level of peculiar velocity is expected to vary with redshift,
we fix its value to 100 km s$^{-1}$ both because the range of $z$ we
investigate is not very large, and because the correction is anyway
important only at scales less than 1 Mpc/$h$.

Figure 3. Theoretical trend of the conditional density for various values of
the parameters. The thick line is for a CDM model in a flat space with shape
parameter $\Gamma = 0.2$. The dotted line includes the non-linear correction.
The thin line assumes a higher value of the velocity dispersion. The
short-dashed line is for $\Gamma = 0.5$, while the long-dashed one is for an
open model. All models are normalized to $\sigma_8 = 0.6\,\Omega_0^{-0.6}$,
and are evaluated at $z = 3$. Beyond 3 Mpc/$h$ the non-linear correction
becomes negligible.

Figure 4. The circles represent the normalized integral density of
neighbours for the seven high redshift cloud systems with various cut-offs
in column density. The short-dashed and the long-dashed lines give the
one-sigma and the two-sigma error band, respectively. The continuous lines
are the linear CDM expectations, for $\Gamma = 0.2$, the lower curve, and
$\Gamma = 0.5$, the upper curve. The upper and the lower panels are for the
seven high and low redshifts respectively.

Let us mention that the expected IGM spectrum deviates from the CDM spectrum
at small scales also because of the clustering suppression below the IGM
Jeans mass. This deviation can be accounted for by a term which depends
essentially on the IGM temperature and redshift (Bi & Davidsen 1997). As
long as $T < 10^5$ K (Kim et al. (1997) found that at $z = 3$ less than 19%
of the Lyα clouds with $12.8 < \log N_{\rm HI} < 16$ can have
$T > 10^5$ K), the scale at which an appreciable deviation occurs is smaller
than 0.5 $h^{-1}$ Mpc, for all relevant $z$. In the following we therefore
neglect this factor.
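
The redshift-space factor $G(\beta, y)$ of Eq. (15) can be evaluated with the following Python sketch. It is an illustration under stated assumptions, not the authors' code: the closed form is taken as $G = (\sqrt{\pi}/8)\,{\rm erf}(y)\,y^{-5}\,[3\beta^2 + 4\beta y^2 + 4y^4] - \exp(-y^2)\,(4y^4)^{-1}\,[\beta^2(3 + 2y^2) + 4\beta y^2]$, the conversion $y = k\sigma_v/100$ assumes $k$ in $h$ Mpc$^{-1}$, $\sigma_v$ in km s$^{-1}$ and $H_0 = 100\,h$ km s$^{-1}$ Mpc$^{-1}$, and the function names are hypothetical.

```python
import math


def dimensionless_y(k_h_mpc, sigma_v_kms, hubble=100.0):
    """y = k * sigma_v / H_0, with k in h/Mpc and H_0 = 100 h km/s/Mpc."""
    return k_h_mpc * sigma_v_kms / hubble


def redshift_factor(beta, y):
    """Ratio G(beta, y) of the redshift-space to the real-space spectrum."""
    if y < 1e-2:
        # Limit y -> 0 of the closed form (no velocity dispersion); it avoids
        # the cancellation between terms of order y**-4 and neglects
        # corrections of order y**2.
        return 1.0 + 2.0 * beta / 3.0 + beta ** 2 / 5.0
    a = math.sqrt(math.pi) / 8.0 * math.erf(y) / y ** 5
    b = math.exp(-y * y) / (4.0 * y ** 4)
    return (a * (3.0 * beta ** 2 + 4.0 * beta * y ** 2 + 4.0 * y ** 4)
            - b * (beta ** 2 * (3.0 + 2.0 * y ** 2) + 4.0 * beta * y ** 2))


if __name__ == "__main__":
    beta = 1.0 ** 0.6        # beta = Omega_0**0.6 for Omega_0 = 1
    sigma_v = 100.0          # km/s, the value adopted in the text
    for wavelength in (1.0, 3.0, 10.0, 100.0):      # Mpc/h
        k = 2.0 * math.pi / wavelength
        y = dimensionless_y(k, sigma_v)
        print(f"lambda = {wavelength:6.1f} Mpc/h   G = {redshift_factor(beta, y):.4f}")
```

At large scales the factor tends to the constant boost $1 + 2\beta/3 + \beta^2/5$, while at small scales it falls off as $\sqrt{\pi}/(2y)$, which is the enhancement and suppression described in the text.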
4 POWER SPECTRUM RESULTS

We calculated the PS for all our cloud systems, and averaged over the high
and low redshift subsets mentioned in Section 2. In this section we assume
for simplicity $\Omega_0 = 1$, since here we will not compare the results to
a CDM model, but rather to Poisson distributions. We have tested that putting
$\Omega_0 = 0.4$ in the distance-redshift relation does not modify the
conclusions. In Fig. 1 we show the power-to-noise ratios $P(k)/P_{sn}$ for
the average of the six higher and, separately, of the seven lower redshift
QSO lines of sight in the range from 1 to 100 Mpc/$h$ with cut-offs
$\log N_{\rm HI} > 13.3$ and $\log N_{\rm HI} > 13.8$ (the wavelength being
$2\pi/k$). Since we have expressed our spectra in units of the noise, we can
take the average straightforwardly: the difference in line density due to
evolution and to possible other observational effects is automatically taken
into account. The result is that there is no evidence of deviation from the
Poisson spectra, i.e. that there are no preferential scales, or
"periodicity", in the clustering. Although there are a few isolated peaks
above $2\sigma$, they are neither stronger nor more frequent than one would
get in a pure Poisson distribution. This is quantitatively reported in
Fig. 2, where we display the cumulative distribution of the peaks in all the
13 spectra at two column density cut-offs: the CDF is always less than
$1\sigma$ from Poisson, even if we can have signals as strong as 9.5 times
the noise (top left panel of Fig. 2). The same holds true for all the other
cuts in column density.

Figure 5. As in Fig. 4b, now assuming an open universe with $\Omega_0 = 0.4$.

We conclude that the power spectrum analysis of the 
distribution of Lya lines does not detect clustering on any 
particular scale. This does not necessarily mean, of course, 
that there is no clustering above Poisson; a suitable filtering 
of the data could in fact reveal a significant signal. However, 
a filtering in $k$-space has not in general a direct geometrical
meaning. It is therefore simpler to consider an integral
statistic in real space, such as the integral density of neighbours,
and to compare this statistic with the theoretical
expectation.



5 DENSITY OF NEIGHBOURS RESULTS

As one can see from the results of the previous Section, the 1-dim power
spectrum is extremely noisy, and cannot easily be compared with the
theoretical predictions. To perform this comparison, we therefore adopt in
this Section the integral density of neighbours.

The theoretical linear CDM-like galaxy spectrum in real space can be written
as

$$P_{3r,{\rm cdm}}(k, \Gamma) = A\, k\, T(k, \Gamma)^2$$

where $T(k, \Gamma)$ is the transfer function of Davis et al. (1985),
$\Gamma$ is a theoretical free parameter (equal to $\Omega_0 h$ in CDM
models if $k$ is in $h$ Mpc$^{-1}$) and $A$ is the normalization factor. We
normalize by setting the present rms fluctuation $\sigma_8$ in 8 $h^{-1}$
Mpc spherical top-hat cells equal to the cluster abundance value
$0.6\,\Omega_0^{-0.6}$ (White, Efstathiou & Frenk 1993). We corrected the
spectrum for the non-linear enhancement as suggested by Peacock & Dodds
(1996), but we found that at scales larger than 3 Mpc/$h$ the effect is
negligible. As anticipated, we will compare quantitatively our results to
the data only at scales larger than 3 Mpc/$h$. We have to determine the
expected one-dimensional spectrum from the 3D spectrum in redshift space,
$P_{3s,{\rm cdm}}(k)$, and to evolve it back. This gives the spectrum

$$P_{\rm cdm}(k,z) = \frac{D_+(\Omega_0,z)^2}{2\pi} \int_k^\infty k'\, dk'\, P_{3s,{\rm cdm}}(k') \qquad (16)$$

Putting $P_{\rm cdm}$ in Eq. (14) we can obtain the expected density of
neighbours. In Fig. 3 we report the theoretical density of neighbours for
some interesting cases. Naturally, the expression (16) is an
oversimplification. Both the non-linear and the redshift corrections induce
$z$-dependent distortions on the spectrum. An accurate modeling of these
distortions requires extensive $N$-body calculations. However, these
distortions are important only at small scales, where in any case the
uncertainties in the estimation of the Lyman-α spectrum are a major hurdle
to a precise comparison.

According to the standard model of biased gravitational instability, and to
the findings of Cristiani et al. (1996), we expect that the clustering
increases with decreasing redshift, and with increasing column density.
However, fully understanding how the 1D distribution of HI column density in
QSO spectra represents the true 3D distribution of hydrogen in the Universe
at different redshifts is not an easy task. Given an HI column density
associated with an absorption line, the real mass of the Lyα clouds depends
on the ionisation correction $N_{\rm HI}/N_{\rm H}$ in the cloud, which
gives the total hydrogen column density, and on the geometry of the cloud.
Both these quantities can be functions of redshift due to the evolution of
the UV background flux $J_\nu$, which photoionises the cloud, and to the
evolution of the gravitational instabilities in the clouds. To estimate a
pure redshift evolution of the 3D power spectrum of the intergalactic HI, it
is thus necessary to consider any redshift evolution of the Lyα clouds not
associated with gravitational clustering. Since this is not a trivial
problem, in this paper we do not investigate the evolution of the clustering
in redshift at a fixed value of the column density, but, as a preliminary
task, we map the variation of clustering both with redshift and column
density. We leave to future work the comparison of this evolution map with
the theory or with $N$-body and hydrodynamic simulations, a necessary step
in the reconstruction of the history of the IGM.

We show in Fig. 4 the density of neighbours for several of our subsets. The
error bands are Poisson errors estimated via Monte Carlo simulations; we
adopted the Poisson errors since the level of cosmic variance is always
negligible. For the set of the six highest redshift systems
($\langle z \rangle = 3.28$) there is no detectable clustering, no matter
what the cut in the column density is. For the lowest redshift subset
($\langle z \rangle = 2.39$), we see a clear increase in clustering with
$N_{\rm HI}$. In Fig. 5 the same low redshift subset is shown, now assuming
an open universe with $\Omega_0 = 0.4$. Comparing with the CDM curve, we see
that the lines with $\log N_{\rm HI} > 13.3$ are near or above the CDM linear
predictions, both for the flat and the open geometry. We also compared the
results for the two line sets 1 and 1b listed in Table 1; the results are
compatible within the errors. It is interesting to observe that the CDM
model is not an adequate fit for the low $z$, high column density data,
independently of the bias factor. The normalized density of neighbours is in
fact always significantly less than unity at large scales (that is, there is
large scale anticorrelation) contrary to the CDM model. We could not
reproduce accurately this large scale feature even by varying the shape
parameter $\Gamma$.

In Fig. 6 we show the clustering trend in the six subsets of Table 2 for
$\log N_{\rm HI} > 13.8$, assuming $\Omega_0 = 1$. The clustering increases
regularly from $z \sim 3.5$ to $z \sim 2$, although the increase is
considerably faster at low $z$. Assuming as a reference the CDM predictions
with $\Gamma = 0.2$ and linear evolution, we can estimate from the sets in
Table 2 the biasing function



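$$b^2_{\rm Ly\alpha}(z, N_{\rm HI}) = \frac{\rho_n(z, \langle N_{\rm HI} \rangle) - 1}{\rho_{n,{\rm CDM}}(z) - 1} = \frac{P_{\rm Ly\alpha}(k)}{P_{\rm cdm}(k)} \qquad (17)$$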



which expresses, for any given average $\langle N_{\rm HI} \rangle$ and any
given redshift, the relation between the real clustering and the linear CDM
predicted clustering (the average $\langle N_{\rm HI} \rangle$ has been
obtained excluding the very few lines with $\log N_{\rm HI} > 16$). Clearly,
$b_{\rm Ly\alpha} > 1$ means that the data are more clustered than the
linear-extrapolated CDM distribution; $b_{\rm Ly\alpha} < 1$ implies an
under-clustering of the data. In Fig. 7 we plot the function
$b^2_{\rm Ly\alpha}(z, N_{\rm HI})$. As already mentioned, the real evolution
is likely to be both in column density and clustering strength. If one
assumes arbitrarily that the average column density does not change with
time, then the evolution in redshift turns out to be extremely fast. In
flat space, clustering growth as fast as (1 + z)~ z or faster 
seem to fit the trend we obtain for high column density 
cut-offs. One can obtain more reasonable values, i.e. values
closer to the linear trend, only by assuming at the same
time a decrease of the average column density with decreasing
redshift. To give a rough approximation, mostly valid at
redshift near 2, we can fit the function b_Lyα as



\[ b_{\rm Ly\alpha} = \left( \frac{N_{\rm HI}}{N_1} \right)^{0.65} \left( \frac{1+z}{1+z_1} \right)^{-1.7} \]   (18)



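where z_1 ≈ 3, and N_1 = 7.1 · 10^… atoms/cm^2 if Ω_0 = 1, and
as

\[ b_{\rm Ly\alpha} = \left( \frac{N_{\rm HI}}{N_1} \right)^{\ldots} \left( \frac{1+z}{1+z_1} \right)^{\ldots} \]   (19)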



where z_1 is the same, and N_1 = 9.7 · 10^14 atoms/cm^2 for
Ω_0 = 0.4. The relative error in the exponents is in all
cases of the order of 30%. The bias function shows quantitatively
the clustering trend both in column density and
redshift. As an interesting consequence, we see that the clustering
evolves linearly, i.e. b_Lyα is constant in redshift, only
if N_HI ~ (1+z)^{1.7/0.65} ~ (1+z)^{2.6±1} (or as (1+z)^{2.3±1} for
the open case). Notice that this conclusion is independent of
the overall amplitude of the bias function; it depends only
on the relative level of clustering of the Lyα distributions.
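
The step from the power-law fit to the required column density evolution can be checked numerically. The sketch below is only illustrative: it assumes the flat-space fit has the form b = (N_HI/N_1)^a [(1+z)/(1+z_1)]^(-b) with a = 0.65 and b = 1.7 (the values implied by the ratio 1.7/0.65 quoted above), and it assumes that the two 30% exponent errors are independent, which the text does not state. The function names and the normalisation used in the loop are hypothetical.

```python
import math

# Exponents of the flat-space fit, read from the ratio 1.7/0.65 in the text.
ALPHA = 0.65    # exponent on N_HI / N_1
BETA = 1.7      # magnitude of the (negative) exponent on (1 + z) / (1 + z_1)
Z1 = 3.0        # reference redshift z_1
REL_ERR = 0.30  # relative error on each exponent, as quoted in the text


def bias(z, n_hi, n1, alpha=ALPHA, beta=BETA, z1=Z1):
    """Power-law bias b = (N_HI/N_1)^alpha * ((1+z)/(1+z_1))^(-beta)."""
    return (n_hi / n1) ** alpha * ((1.0 + z) / (1.0 + z1)) ** (-beta)


def column_density_slope(alpha=ALPHA, beta=BETA, rel_err=REL_ERR):
    """Slope g in N_HI ~ (1+z)^g that keeps the bias constant in redshift.

    The error assumes (our assumption) independent exponent errors,
    so the two relative errors are added in quadrature.
    """
    slope = beta / alpha
    sigma = slope * math.sqrt(2.0) * rel_err
    return slope, sigma


if __name__ == "__main__":
    slope, sigma = column_density_slope()
    print(f"N_HI ~ (1+z)^({slope:.1f} +/- {sigma:.1f})")
    n1 = 1.0e14  # hypothetical normalisation, NOT the fitted N_1
    for z in (2.0, 3.0, 4.0):
        # Let N_HI follow the derived scaling and confirm b stays fixed.
        n_hi = n1 * ((1.0 + z) / (1.0 + Z1)) ** slope
        print(z, bias(z, n_hi, n1))
```

Under these assumptions the slope comes out close to 2.6 with an uncertainty of about 1, and the bias stays at its reference value along the track N_HI ~ (1+z)^{2.6}, whatever N_1 is. This is the sense in which the conclusion does not depend on the overall amplitude of the bias function.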

Two caveats have to be kept in mind. First, our assumption
of a scale-independent bias may prove incorrect.
However, we used it as a simple way to parameterize the
Lyman-α spectrum in terms of the CDM spectrum, and as
such it appears, a posteriori, to be an acceptable approximation.
Second, the clustering evolution of the clouds need not be
linear, even at the scales we are investigating. In fact, along
with the linear evolution due to the gravitational instability,
the clustering may also evolve in response to the evolution of
the ionising sources. If, as shown e.g. by Bagla (1998), the
evolution of the clustering of the first collapsed objects is
far from linear at z > 1, then the evolution of the Lyman-α
clouds could also be non-linear.



6 CONCLUSIONS 

The aim of this paper was to reconstruct the power spectrum
of the Lyman-α clouds as a function both of redshift
and average column density, and to compare it to theoretical
CDM models. We first estimated directly the line-of-sight
power spectrum: we concluded that there is no evidence of
characteristic scales in the cloud distribution, contrary to
what is found in some galaxy surveys. Since the direct reconstruction
of the power spectrum is too noisy to derive useful
comparisons with theoretical expectations, we next employed
the integrated density of neighbours. We assumed that
the cloud spectrum can be fitted by



\[ P_{\rm Ly\alpha}(z, k) = b^2_{\rm Ly\alpha}(z, N_{\rm HI}) \, P_{\rm cdm}(z, k) \]   (20)



where b^2_Lyα characterizes the Lyman-α line bias with respect
to a CDM model normalized to the cluster abundance. This
fit performs acceptably well for scales between 1 and 10
Mpc/h, while the data are systematically anticorrelated on
larger scales. We found that the Lyα clouds are clustered
with power similar to the theoretical CDM only at redshift
near 2 and average column density higher than 10^{13.3}
HI atoms/cm^2, being underclustered (b^2_Lyα < 1) in all the
other cases. A reasonable expectation is that the cloud evolution
is linear for most of the scales considered here. Imposing
this condition upon the fits (18) and (19) we conclude
that the average column density must decrease in time as
N_HI ~ (1+z)^{2.6} (or as (1+z)^{2.3} for an Ω_0 = 0.4 universe).
For the popular model with a cosmological constant such
that Ω_Λ = 0.6 the results of the open case apply to a
good approximation, since Ω_Λ does not significantly affect
the clustering evolution. Notice that the function b_Lyα depends
on assuming the specific parametric form (20) and in
particular on the clustering redshift evolution of the dark
matter spectrum. However, the trend in z we find is so fast
that it seems difficult to account for it without a strong
redshift decrease of N_HI.
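
As a rough illustration of how a scale-independent bias can be read off the integrated density of neighbours, the sketch below averages the ratio of the excess over unity in the data to that of the CDM reference, restricted to the 1 to 10 Mpc/h range where the fit is said to work. This is an assumed, simplified procedure (an unweighted mean over scales), not necessarily the one used for the results above; the function name, argument names, and toy numbers are hypothetical.

```python
def bias_squared(r, rho_data, rho_cdm, r_min=1.0, r_max=10.0):
    """Unweighted mean of (rho_n - 1) / (rho_n,CDM - 1) for r_min <= r <= r_max.

    r        : scales in Mpc/h
    rho_data : normalized density of neighbours measured from the lines
    rho_cdm  : the same statistic for the reference CDM model
    The CDM excess must be non-zero on every scale that is used.
    """
    ratios = [
        (d - 1.0) / (c - 1.0)
        for ri, d, c in zip(r, rho_data, rho_cdm)
        if r_min <= ri <= r_max
    ]
    if not ratios:
        raise ValueError("no scales inside the fitting range")
    return sum(ratios) / len(ratios)


# Toy input, made up for illustration only (not data from the paper).
r = [1.0, 2.0, 4.0, 8.0, 16.0]
rho_cdm = [1.0 + 0.5 / ri for ri in r]               # hypothetical CDM excess
rho_data = [1.0 + 0.25 * (c - 1.0) for c in rho_cdm]  # built with b^2 = 0.25

b2 = bias_squared(r, rho_data, rho_cdm)
print(b2)  # below 1: under-clustered with respect to the CDM reference
```

Scales above r_max are excluded on purpose: the large scale anticorrelation makes the data excess negative there, so a single positive b^2 cannot describe that range.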

It is clear that the dataset investigated here is too small 
to give definitive results. One of the problems is that the 
most recent and higher cut-off samples are also the most 
sparse, and thus give higher statistical uncertainty. One con- 
clusion is however rather firm: the cloud clustering can be 
explained only by assuming a simultaneous increase in corre- 
lation and decrease in average column density with redshift. 